OCR exploration server

Vector-based segmentation of text connected to graphics in engineering drawings

Internal identifier: 002719 (Main/Exploration); previous: 002718; next: 002720

Authors: Dov Dori [Israel]; Liu Wenyin [Israel]

Source: Lecture Notes in Computer Science (ISSN 0302-9743), 1996

RBID : ISTEX:F677A4D1219B991491A242E4B4B6947E5195CE0D

Abstract

A method for segmentation of text that may be connected to graphics in engineering drawings is presented. It consists of three steps: growing individual characterbox regions, using a recursive merging scheme by stroke linking; merging the detected characterboxes into a textbox and determining its orientation; and re-segmenting the textbox back into refined characterboxes that can be input to an OCR subsystem. The method can segment dimensioning text as well as other classes of text. It handles both isolated and touching characters, aligned at any slant. The capability of segmenting characters that touch either each other or graphics, which is an important feature in handling real-life drawings, is obtained by focusing on intermediate vector information rather than on the raw pixel data. We present the details of the algorithm and show both successful and unsuccessful examples from an experimental set of 36 dimensioning textboxes, in which a 94% segmentation rate was achieved with a 3% false alarm rate.
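
The second step named in the abstract (merging characterboxes into a textbox and determining its orientation) can be illustrated with a minimal sketch. This is not the authors' algorithm, which works on vector and stroke data and is not detailed on this page. The sketch assumes that characterboxes are already available as axis-aligned rectangles, and it swaps in a simple principal-axis fit through the box centers to estimate orientation. The function name `merge_characterboxes` and the box representation are hypothetical.

```python
import math

def merge_characterboxes(boxes):
    """Merge axis-aligned character boxes (xmin, ymin, xmax, ymax) into one
    enclosing textbox and estimate the text orientation in degrees.

    Assumption: orientation is taken from the principal axis of the box
    centers. This stands in for the paper's method, which is not given here.
    """
    if not boxes:
        raise ValueError("at least one character box is required")
    # Enclosing textbox: union of all character boxes.
    textbox = (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )
    # Box centers serve as sample points along the text line.
    centers = [((x0 + x1) / 2, (y0 + y1) / 2) for x0, y0, x1, y1 in boxes]
    n = len(centers)
    mean_x = sum(x for x, _ in centers) / n
    mean_y = sum(y for _, y in centers) / n
    # Second-order central moments of the centers.
    sxx = sum((x - mean_x) ** 2 for x, _ in centers)
    syy = sum((y - mean_y) ** 2 for _, y in centers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in centers)
    # Angle of the principal axis, in the range (-90, 90] degrees.
    angle = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return textbox, angle

# Three hypothetical character boxes laid out along a diagonal.
boxes = [(0, 0, 2, 2), (3, 3, 5, 5), (6, 6, 8, 8)]
textbox, angle = merge_characterboxes(boxes)
print(textbox, round(angle, 1))
```

With the three diagonal boxes above, the enclosing textbox spans all of them and the estimated slant is about 45 degrees. A single box yields an angle of 0, since no direction can be inferred from one center.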

DOI: 10.1007/3-540-61577-6_33


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Vector-based segmentation of text connected to graphics in engineering drawings</title>
<author>
<name sortKey="Dori, Dov" sort="Dori, Dov" uniqKey="Dori D" first="Dov" last="Dori">Dov Dori</name>
</author>
<author>
<name sortKey="Wenyin, Liu" sort="Wenyin, Liu" uniqKey="Wenyin L" first="Liu" last="Wenyin">Liu Wenyin</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:F677A4D1219B991491A242E4B4B6947E5195CE0D</idno>
<date when="1996" year="1996">1996</date>
<idno type="doi">10.1007/3-540-61577-6_33</idno>
<idno type="url">https://api.istex.fr/document/F677A4D1219B991491A242E4B4B6947E5195CE0D/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000E13</idno>
<idno type="wicri:Area/Istex/Curation">000D80</idno>
<idno type="wicri:Area/Istex/Checkpoint">001B30</idno>
<idno type="wicri:doubleKey">0302-9743:1996:Dori D:vector:based:segmentation</idno>
<idno type="wicri:Area/Main/Merge">002863</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:97-0020290</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000973</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000A25</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000920</idno>
<idno type="wicri:doubleKey">0302-9743:1996:Dori D:vector:based:segmentation</idno>
<idno type="wicri:Area/Main/Merge">002A07</idno>
<idno type="wicri:Area/Main/Curation">002719</idno>
<idno type="wicri:Area/Main/Exploration">002719</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Vector-based segmentation of text connected to graphics in engineering drawings</title>
<author>
<name sortKey="Dori, Dov" sort="Dori, Dov" uniqKey="Dori D" first="Dov" last="Dori">Dov Dori</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Israël</country>
<wicri:regionArea>Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, 32000, Haifa</wicri:regionArea>
<wicri:noRegion>Haifa</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Israël</country>
</affiliation>
</author>
<author>
<name sortKey="Wenyin, Liu" sort="Wenyin, Liu" uniqKey="Wenyin L" first="Liu" last="Wenyin">Liu Wenyin</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Israël</country>
<wicri:regionArea>Faculty of Industrial Engineering and Management, Technion-Israel Institute of Technology, 32000, Haifa</wicri:regionArea>
<wicri:noRegion>Haifa</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Israël</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>1996</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">F677A4D1219B991491A242E4B4B6947E5195CE0D</idno>
<idno type="DOI">10.1007/3-540-61577-6_33</idno>
<idno type="ChapterID">33</idno>
<idno type="ChapterID">Chap33</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Computer aided design</term>
<term>Engineering</term>
<term>Graphics</term>
<term>Industrial drawing</term>
<term>Segmentation</term>
<term>Text</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Conception assistée</term>
<term>Dessin industriel</term>
<term>Ingénierie</term>
<term>Reconnaissance caractère</term>
<term>Représentation graphique</term>
<term>Segmentation</term>
<term>Texte</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: A method for segmentation of text that may be connected to graphics in engineering drawings is presented. It consists of three steps: growing individual characterbox regions, using a recursive merging scheme by stroke linking; merging the detected characterboxes into a textbox and determining its orientation; and re-segmenting the textbox back into the refined characterbox that can be input to an OCR subsystem. The method can segment dimensioning text as well as other classes of text. It handles both isolated and touching characters, aligned at any slant. The capability of segmenting characters that touch either themselves or graphics, which is an important feature in handling real life drawings, is obtained by focusing on intermediate vector information rather than on the raw pixel data. We present the details of the algorithm and show both successful and unsuccessful examples from an experimental set of 36 dimensioning textboxes, in which 94% segmentation rate was achieved with 3% false alarm rate.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Israël</li>
</country>
</list>
<tree>
<country name="Israël">
<noRegion>
<name sortKey="Dori, Dov" sort="Dori, Dov" uniqKey="Dori D" first="Dov" last="Dori">Dov Dori</name>
</noRegion>
<name sortKey="Dori, Dov" sort="Dori, Dov" uniqKey="Dori D" first="Dov" last="Dori">Dov Dori</name>
<name sortKey="Wenyin, Liu" sort="Wenyin, Liu" uniqKey="Wenyin L" first="Liu" last="Wenyin">Liu Wenyin</name>
<name sortKey="Wenyin, Liu" sort="Wenyin, Liu" uniqKey="Wenyin L" first="Liu" last="Wenyin">Liu Wenyin</name>
</country>
</tree>
</affiliations>
</record>
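
As an illustration of how the record above can be read programmatically, here is a minimal Python sketch that pulls the title, authors, year and DOI out of a trimmed copy of the record. The record as displayed uses the `wicri:` prefix without declaring it, so a strict XML parser would reject it. The sketch therefore binds the prefix to a placeholder URI (`urn:example:wicri`), which is an assumption and not an official namespace. The helper name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record shown above. The xmlns:wicri declaration is a
# placeholder added for this sketch (assumption); the page does not give one.
RECORD = """<record xmlns:wicri="urn:example:wicri">
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Vector-based segmentation of text connected to graphics in engineering drawings</title>
<author>
<name sortKey="Dori, Dov" first="Dov" last="Dori">Dov Dori</name>
</author>
<author>
<name sortKey="Wenyin, Liu" first="Liu" last="Wenyin">Liu Wenyin</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">ISTEX:F677A4D1219B991491A242E4B4B6947E5195CE0D</idno>
<date when="1996" year="1996">1996</date>
<idno type="doi">10.1007/3-540-61577-6_33</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

def summarize(record_xml):
    """Return the title, author names, year and DOI found in a record string."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = root.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "year": pub_stmt.find("date").get("when"),
        # Select the idno element whose type attribute is "doi".
        "doi": pub_stmt.findtext("idno[@type='doi']"),
    }

summary = summarize(RECORD)
print(summary)
```

The element names themselves carry no prefix in this record, so plain paths such as `author/name` are enough. Only the `wicri:` attributes and elements depend on the placeholder declaration.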

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002719 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002719 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:F677A4D1219B991491A242E4B4B6947E5195CE0D
   |texte=   Vector-based segmentation of text connected to graphics in engineering drawings
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024